Multilingual Summarization with Polytope Model

Authors

  • Natalia Vanetik
  • Marina Litvak
Abstract

The problem of extractive text summarization for a collection of documents is defined as the problem of selecting a small subset of sentences so that the contents and meaning of the original document set are preserved in the best possible way. In this paper we describe a linear programming-based global optimization model for ranking and extracting the sentences most relevant to a summary. We introduce three different objective functions to be optimized. Each defines, in a different manner, the sentence relevance being maximized: coverage of the meaningful words of a document, coverage of its bigrams, or coverage of frequent word sequences. We also give an overview of our system's participation in the MultiLing contest of SIGDial 2015.
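The coverage notions named in the abstract can be illustrated with a minimal sketch. This is not the authors' linear programming model: it swaps in an exhaustive search over sentence subsets, which is practical only for a handful of sentences, and it treats every whitespace-separated token as a "meaningful word" (no stop-word filtering). The function names `unigrams`, `bigrams`, `coverage`, and `best_summary`, and the `max_words` budget, are hypothetical names chosen for illustration.

```python
from itertools import combinations


def unigrams(sentence):
    """Distinct lowercased words of a sentence (assumption: no stop-word removal)."""
    return set(sentence.lower().split())


def bigrams(sentence):
    """Distinct pairs of adjacent lowercased words of a sentence."""
    words = sentence.lower().split()
    return set(zip(words, words[1:]))


def coverage(selected, all_sentences, features):
    """Fraction of the document set's distinct features found in the selected sentences."""
    doc = set().union(*(features(s) for s in all_sentences))
    summ = set().union(*(features(s) for s in selected))
    return len(summ & doc) / len(doc) if doc else 0.0


def best_summary(sentences, max_words, features=unigrams):
    """Exhaustive stand-in for the optimization step: try every subset of
    sentences within the word budget and keep the one with highest coverage."""
    best, best_score = (), -1.0
    for k in range(1, len(sentences) + 1):
        for subset in combinations(sentences, k):
            if sum(len(s.split()) for s in subset) > max_words:
                continue  # subset exceeds the summary length budget
            score = coverage(subset, sentences, features)
            if score > best_score:
                best, best_score = subset, score
    return list(best), best_score


docs = ["the cat sat", "the dog sat", "a bird flew away"]
summary, score = best_summary(docs, max_words=7)
print(summary, score)
```

Passing `features=bigrams` switches the objective from word coverage to bigram coverage. A linear programming formulation, as described in the abstract, would replace the subset enumeration with a solver over sentence-selection variables.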

Similar resources

Multilingual Multi-Document Summarization with POLY2

In this paper we present a linear model for the problem of text summarization, where a summary preserves the information coverage as much as possible in comparison to the original document set. We reduce the problem of finding the best summary to the problem of finding the point on a convex polytope closest to the given hyperplane, and solve it efficiently with the help of fractional linear pro...

Polytope Model for Extractive Summarization

The problem of text summarization for a collection of documents is defined as the problem of selecting a small subset of sentences so that the contents and meaning of the original document set are preserved in the best possible way. In this paper we present a linear model for the problem of text summarization, where we strive to obtain a summary that preserves the information coverage as much a...

ExB Text Summarizer

We present our state-of-the-art multilingual text summarizer capable of single as well as multi-document text summarization. The algorithm is based on repeated application of TextRank on a sentence similarity graph, a bag of words model for sentence similarity and a number of linguistic pre- and post-processing steps using standard NLP tools. We submitted this algorithm for two different tasks of...

A Platform for Multilingual News Summarization

We have developed a multilingual version of Columbia Newsblaster as a testbed for multilingual multi-document summarization. The system collects, clusters, and summarizes news documents from sources all over the world daily. It crawls news sites in many different countries, written in different languages, extracts the news text from the HTML pages, uses a variety of methods to translate the doc...

CIST System Report for ACL MultiLing 2013 ‐ Track 1: Multilingual Multi-document Summarization

This report describes the methods applied in the CIST system participating in ACL MultiLing 2013. Summarization is based on sentence extraction. An hLDA topic model is adopted for multilingual multi-document modeling. Various features are combined to evaluate and extract candidate summary sentences.

Publication date: 2015